Executive Summary

BACKORDER  ESTIMATION  UNDER  MULTIPLE  FAILURES 
OF  LOWER  INDENTURE  ITEMS:  A  TECHNICAL  NOTE 


Most  multi-echelon,  multi-indenture  stockage  models  used  by  the  Military 
Services  and  industry  are  extensions  of  the  basic  Multi-Echelon  Technique  for 
Recoverable  Item  Control  (METRIC)  model.  Those  models  assume  that  the  failure  of 
an  item  is  due  to  the  failure  of  one  and  only  one  next  lower  indenture  item,  although 
more than one lower indenture item failure is observed in many real-world
situations. Under such assumptions, the METRIC theory overstates backorders, often
dramatically,  and  that  overstatement  results  in  a  misallocation  of  spares  budgets. 

We  consider  two  types  of  failure  detection  mode  for  lower  indenture  items: 
simultaneous  and  sequential.  For  both  problems,  we  developed  mathematical 
models  that  produce  good  lower  and  upper  bounds  on  the  true  solutions. 
Interpolation  formulas  based  on  univariate  regression  provide  an  estimate  of  the 
true  solution.  In  the  simultaneous  detection  problem,  METRIC’s  more  than 
300  percent  average  absolute  error  has  been  reduced  to  less  than  4  percent  on  a 
sample  of  120  simulation  cases;  in  the  sequential  problem,  the  error  is  reduced  from 
9  percent  to  6  percent.  The  maximum  errors  are  reduced  more  dramatically.  The 
new  analytic  procedures  are  easy  to  calculate,  and  they  should  be  relatively  easy  to 
incorporate  into  computer  programs  for  spares  optimization. 
CHAPTER 1


INTRODUCTION 


OBJECTIVE 

The  objective  of  this  research  is  to  find  easily  computed  approximations  to  the 
expected  backorders  (EBOs)  when  multiple  failures  of  lower  indenture  items  can 
occur.  The  average  of  the  absolute  percent  error  over  a  large  number  of  cases  should 
be  low,  but  the  maximum  percent  error  should  also  be  acceptable. 

RATIONALE 

Approximation  techniques  for  the  multiple  failure  problem  are  valuable  for 
three  major  reasons: 

•  When  an  item  is  repaired,  more  than  one  next  lower  indenture  item  is  often 
also  repaired  or  replaced. 

•  Current  multi-indenture  stockage  models  in  the  Multi-Echelon  Technique 
for  Recoverable  Item  Control  (METRIC)  [1]  family  of  models  dramatically 
overstate  backorders  when  multiple  failures  occur;  errors  in  excess  of 
100  percent  are  common.  This  overstating  helps  to  explain  why  the  models 
almost  always  predict  availabilities  lower  than  those  actually  achieved  in 
the  field. 

•  The assumption of one and only one lower indenture failure is the reason
that a simple analytic solution is obtained in multi-indenture METRIC
models. The analytic solution for multiple failures is extremely
complicated because the lower indenture item backorder computations
are no longer independent.

PROBLEM  DESCRIPTION 

Our  description  is  presented  in  terms  of  two  indentures:  a  first-indenture,  line 
replaceable  unit  (LRU)  and  its  second-indenture,  shop  replaceable  units  (SRUs)  at 
one  site.  That  description  simplifies  the  discussion,  and  the  results  can  be  extended 
to  more  indentures  and  echelons. 

The  SRU  failure  detection  process  can  proceed  in  two  very  different  ways: 
simultaneously  and  sequentially.  In  the  simultaneous  case,  we  assume  that  after 
some LRU checkout time, a diagnosis of all failed SRUs is made at a point in time. In
the  sequential  case,  we  assume  that  after  some  LRU  checkout  time,  a  diagnosis  of 
the  first  failed  SRU  is  made.  If  a  spare  SRU  is  available,  it  is  installed  on  the  LRU 
and  the  testing  continues  to  find  the  next  (if  any)  failed  SRU.  However,  if  a  spare 
SRU  is  not  available,  the  diagnosis  of  the  next  failed  SRU  is  delayed. 

With  sequential  detection,  the  EBOs  are  much  larger  than  in  the  case  of 
simultaneous  detection.  The  natural  question  is  which  process  is  closer  to  reality? 

We  describe  the  Air  Force  failure  detection  process  for  LRUs  with  automatic 
test equipment (ATE), as performed in an Aircraft Instrument Shop. The LRU is
placed  on  a  test  stand,  and  a  series  of  automated  tests  is  performed  until  a  failure  is 
detected  in  some  SRU.  A  replacement  SRU  is  installed,  and  the  automatic  test 
sequence is restarted at the "break point" in the software preceding the last test that
failed.  A  spare  SRU  is  usually  available  for  installation,  because  a  mock-up  or  shop 
standard  is  available  and  good  SRUs  can  be  pulled  from  it.  Since  the  detection  of 
failed  SRUs  does  not  require  that  any  SRU  be  repaired  before  testing  can  continue, 
the  detection  process  is  approximately  simultaneous  (the  test  sequence  may  require 
24  to  36  hours,  but  that  time  is  included  in  the  model  as  the  time  for  LRU  checkout). 
The  assumption  is  that  after  the  LRU  has  been  completely  diagnosed,  any  SRUs  that 
were taken from the shop standard (thus creating "holes") are replaced. The
LRU  is  not  considered  ready  for  service  until  the  SRU  holes  are  filled  and  the  LRU  is 
retested.  However,  the  shop  can  begin  to  repair  the  failed  SRUs  immediately. 

The  simultaneous  detection  scenario  is  a  little  optimistic,  of  course,  since  it  is 
possible  that  a  given  SRU  from  the  mock-up  will  be  required  in  two  or  more  LRUs  at 
the  same  time.  There  may  be  other  reasons  that  the  simultaneous  model  is  too 
optimistic,  particularly  on  LRUs  without  ATE.  In  some  cases  no  mock-ups  are 
available  and  thus  a  good  SRU  may  not  always  be  ready  for  installation.  For  those 
reasons,  we  model  both  types  of  detection. 

MULTIPLE  FAILURE  DATA 

Before  modeling  the  multiple  SRU  failure  problem,  we  introduce  some 
evidence  that  the  problem  really  does  exist.  In  a  recent  report  to  the  Air  Force  [2], 
we  examined  detailed,  hierarchical  data  for  10  LRUs  and  their  families  of  lower 
indenture  parts  (to  the  fifth  indenture  in  some  cases)  and  found  that  lower  indenture 
demand  was  often  much  greater  than  that  for  the  parent  item.  For  one  LRU  that 
ratio exceeded 10.¹ This effect becomes more important at the depot and as one
moves  to  lower  indentures. 


Another  source  of  multiple  SRU  failures  is  battle  damage  in  wartime  scenarios. 
It  is  quite  likely  that  more  than  one  SRU  located  close  together  will  be  affected. 
Unfortunately,  we  have  no  detailed  data  on  this  phenomenon. 

REPORT  ORGANIZATION 

We address the simultaneous detection problem in the next chapter and the
sequential detection problem in the following chapter. In each case, we provide
background  on  the  problem,  a  description  of  the  simulation  and  the  analytic  model 
used  to  compute  lower  and  upper  bounds  on  the  solution,  the  interpolation  procedure 
suggested  for  estimating  the  solution,  and  the  numerical  results  and  an  analysis  of 
them.  The  final  chapter  presents  our  conclusions.  We  also  provide  an  appendix 
showing  how  to  compute  Erlang  state  probabilities  for  any  mean  and  any  variance- 
to-mean  ratio  less  than  one. 


¹The ratios of total SRU demand divided by LRU demand for the 10 LRUs were 10.21, 5.80,
0.66,  0.08, 1.48,  0.75,  2.97,  0.55,  0.06,  and  0.00;  the  corresponding  ratios  for  the  depot  replacement 
factors  [including  second-indenture  economic  order  quantity  (EOQ)  items  as  well  as  reparables] 
were  240.94,  4.43,  0.44,  2.31,  23.10,  0.85,  14.97,  136.53,  0.34,  and  2.31.  The  latter  set  of  numbers 
should  be  higher,  but  we  know  from  an  examination  of  the  Illustrated  Parts  Breakdown  that  it  is 
overstated  in  some  cases  because  of  errors  in  the  parts  hierarchy  data.  In  particular,  the  value  of 
240.94  was  found  to  be  much  too  large  because  many  of  the  EOQ  items  listed  as  second-indenture 
are  really  lower  indenture  parts. 
CHAPTER 2


SIMULTANEOUS  FAILURE  DETECTION 


BACKGROUND 

Let  N  denote  the  number  of  SRUs  on  the  LRU.  When  the  LRU  fails,  there  is  a 
probability p(i), where 0 ≤ p(i) ≤ 1 and i = 1, 2, ..., N, for the failure of each SRU. In
other words, from 0 to N failures of SRUs may occur whenever an LRU fails. We
assume that the SRU failures are independent of each other (that assumption is
probably  not  strictly  valid,  but  no  data  are  available  to  support  more  complicated 
assumptions). 

In  the  simpler,  single  SRU  failure  model,  the  LRU  Poisson  demand  process 
splits neatly into N independent SRU Poisson processes (assuming that the p(i)’s
sum to one or less). Since there is only one SRU in each process, the EBOs can be
computed independently. Furthermore, under the ample service assumption, Palm’s
theorem  can  be  invoked;  it  states  that  the  EBOs  are  independent  of  the  shape  of  the 
repair  distribution  [3]. 
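
The single-failure computation described above can be sketched in a few lines of Python. This is an illustration only, not code from the report: the function names are hypothetical, and the pipeline mean for SRU i is assumed to be the LRU demand rate times p(i) times the mean SRU repair time. By Palm's theorem only that mean is needed, not the shape of the repair distribution.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """P[X = k] for X ~ Poisson(mean), evaluated in log space."""
    if mean == 0.0:
        return 1.0 if k == 0 else 0.0
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


def expected_backorders(mean_pipeline: float, stock: int) -> float:
    """E[(X - s)^+] for a Poisson number X of units in repair and stock s.

    Uses the identity E[(X - s)^+] = E[X] - s + E[(s - X)^+], so only a
    finite sum over x = 0..s is required.
    """
    shortfall = sum((stock - x) * poisson_pmf(x, mean_pipeline)
                    for x in range(stock + 1))
    return mean_pipeline - stock + shortfall


def sru_pipelines(lru_rate, fail_probs, repair_times):
    """Assumed pipeline mean for each SRU: m * p(i) * T_i."""
    return [lru_rate * p * t for p, t in zip(fail_probs, repair_times)]
```

With zero stock the expected backorders equal the pipeline mean, and each added spare reduces them; because the N processes are independent in this model, the function can be applied to each SRU separately.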

In the multiple failure case, the LRU Poisson demand process splits into 2^N
independent  Poisson  processes  (each  of  the  N  SRUs  may  be  in  one  of  two 
states  —  good  or  bad).  For  a  particular  SRU,  Palm’s  theorem  still  applies.  However, 
since  several  SRUs  may  have  failed  simultaneously,  the  backorder  computations  for 
each  SRU  are  not  independent.  Suppose  no  spare  SRUs  are  available  when  the  LRU 
fails.  The  LRU  repair  cannot  be  completed  until  all  failed  SRUs  have  been  repaired. 
This  implies  that  the  waiting  time  and  the  EBOs  for  the  LRU  will  be  longer  when 
the  SRU  repair  times  are  more  variable  (Palm’s  theorem  does  not  apply  to  the  group 
of  SRUs).  Since  the  results  depend  on  the  shape  of  the  repair  distribution  for  each 
SRU,  we  must  decide  whether  the  variability  extremes  of  constant  or  exponential 
repair  distributions  are  reasonable  or  some  intermediate  distribution  is  preferable. 

The 2^N failure processes for the new model are far more numerous than N (e.g., when
N = 5, 2^N = 32), and their number grows very rapidly. The computation of backorders is further
complicated by the fact that stock is sometimes available for one or more of the failed
SRUs. 
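
To make the growth in the number of processes concrete, the following Python sketch enumerates every good/bad pattern of the N SRUs and the rate of the corresponding Poisson process. It is a minimal illustration under the independence assumption stated above; the function name and the example probabilities are hypothetical.

```python
from itertools import product


def failure_processes(lru_rate, fail_probs):
    """Map each failure pattern (tuple of 0/1, one entry per SRU) to the
    rate of its Poisson process, assuming independent SRU failures."""
    processes = {}
    for pattern in product((0, 1), repeat=len(fail_probs)):
        prob = 1.0
        for failed, p_i in zip(pattern, fail_probs):
            prob *= p_i if failed else (1.0 - p_i)
        processes[pattern] = lru_rate * prob
    return processes


# Example with N = 5 SRUs: 2**5 = 32 processes whose rates sum to the
# LRU demand rate.
example = failure_processes(1.0, [0.5, 0.2, 0.1, 0.9, 0.3])
```

Summing the rates of all patterns in which a given SRU has failed recovers that SRU's own demand rate, which is why Palm's theorem still applies to each SRU taken alone.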


In  an  earlier  report  for  the  Air  Force  [4],  we  developed  methods  for  modeling 
the  impact  of  multiple  simultaneous  failures  and  built  that  capability  into  an 
evaluation  model.  The  basic  idea  was  to  divide  the  repair  of  an  LRU  into  a  set  of 
mutually  exclusive,  collectively  exhaustive  "processes,”  where  a  process  was  defined 
to  be  a  set  of  next  lower  indenture  SRU  failures.  We  applied  standard  multi-echelon 
theory  to  compute  the  probability  distribution  for  the  number  of  units  of  each  process 
in  repair.  From  those  probabilities  and  the  number  of  spares  on  each  SRU,  we  were 
then  able  to  compute  the  probability  distribution  for  the  number  of  EBOs  on  each 
SRU.  Then,  assuming  that  any  SRU  EBOs  are  consolidated  on  the  fewest  LRUs 
(cannibalization),  we  were  able  to  calculate  the  LRU  EBOs  and  availability. 

The  computational  procedure  is  straightforward,  but  highly  time-consuming 
when  more  than  four  or  five  processes  or  items  are  involved.  The  ability  to  perform 
those  calculations  is  useful  in  an  evaluation  model  but  would  not  be  practical  in  an 
optimization  model  where  it  would  have  to  be  performed  many  times.  Furthermore, 
the  capability  was  implemented  only  for  a  single  base  with  two  indentures. 

Another  limitation  of  that  model  is  that  each  item  in  a  process  must  have  the 
same  repair  time,  and  that  repair  time  must  be  a  constant.  For  all  of  those  reasons, 
the  analytic  calculation  used  in  that  evaluation  model  is  not  considered  further  in 
this  report. 

REPAIR  DISTRIBUTION 

In  the  multiple  SRU  failure  case,  the  LRU  EBOs  depend  on  the  shape  of  the 
repair  distribution  for  each  SRU.  Suppose  for  simplicity  that  all  stock  levels  are  zero 
and  that  each  SRU  has  the  same  mean  repair  time  of  1.  If  the  SRU  repair  times  are 
constant, the waiting time to repair the SRUs is 1 regardless of the number that
failed;  if  they  are  exponential,  the  waiting  time  increases  with  the  number  of  SRUs 
as  shown  in  the  second  column  of  Table  2-1. 

Of  course,  constant  and  exponential  repair  times  represent  two  extremes  of  no 
variability  and  high  variability,  respectively.  By  using  an  Erlang  distribution  (the 
sum of k exponential variables), we can obtain results between the extremes.
Results  for  the  Erlang-2,  -3,  and  -4  obtained  by  simulation  are  shown  in  Table  2-1. 
TABLE 2-1

EXPECTED TIME UNTIL THE LAST OF N SRU REPAIRS IS COMPLETE,
EACH WITH MEAN 1

 Number                  Probability distribution
 of SRUs    Exponential    Erlang-2    Erlang-3    Erlang-4

    1          1.00          1.00        1.00        1.00
    2          1.50          1.37        1.30        1.27
    3          1.83          1.60        1.49        1.43
    4          2.08          1.77        1.63        1.55
    5          2.28          1.90        1.73        1.64
    6          2.45          2.01        1.82        1.70
    7          2.59          2.10        1.89        1.76
    8          2.72          2.18        1.95        1.81
    9          2.83          2.26        2.00        1.86
   10          2.93          2.32        2.05        1.90

The  Erlang  is  an  appealing  physical  model  since  one  can  visualize  repair  as  the  sum 
of  several  independent  activities,  such  as  test,  diagnosis,  repair,  and  retest.  If  each 
activity has an exponential distribution with the same mean, the total time has an
Erlang  distribution.  In  the  Erlang  distribution,  the  value  of  k  is  usually  considered 
to  be  an  integer  between  1  and  infinity  (constant  repair  time),  but  the  distribution  is 
defined  for  all  k  >  0  and  is  better  known  as  the  gamma  distribution. 

Table  2-1  shows  that  the  Erlang-4  gives  waiting  times  that  are  about  midway 
between  those  of  the  exponential  and  the  constant  (whose  waiting  time  is  one).  Since 
the  shape  of  the  repair  distribution  matters  and  raw  data  do  not  show  empirical 
distributions  for  repair  time  that  are  at  either  extreme,  we  will  use  the  Erlang-4 
values  as  a  reasonable  compromise. 
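
The entries of Table 2-1 were obtained by simulation, but they can be cross-checked numerically. The sketch below is not the report's method: it assumes independent, identically distributed Erlang-k repair times with integer k and evaluates the expected time of the last completion as the integral of 1 - F(t)^N with a simple trapezoid rule. The function names, integration limit, and step count are choices made for this example.

```python
import math


def erlang_cdf(t: float, k: int, mean: float = 1.0) -> float:
    """CDF of an Erlang-k distribution (integer k) with the given mean."""
    x = k * t / mean
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= x / j          # x**j / j!
        total += term
    return 1.0 - math.exp(-x) * total


def expected_last_repair(n: int, k: int, mean: float = 1.0,
                         upper: float = 40.0, steps: int = 40000) -> float:
    """E[max of n iid Erlang-k times] = integral over t >= 0 of
    (1 - F(t)**n), approximated on [0, upper] by the trapezoid rule."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        value = 1.0 - erlang_cdf(i * h, k, mean) ** n
        total += value if 0 < i < steps else 0.5 * value
    return total * h
```

For the exponential case (k = 1) the result is the harmonic number 1 + 1/2 + ... + 1/N, which gives 1.5 for N = 2 and about 2.93 for N = 10, in agreement with the first column of the table. For Erlang-2 and N = 2 the integral gives 1.375, which is consistent with the simulated value of 1.37.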

SIMULATION 

Simulation  is  used  to  estimate  the  LRU  EBOs  under  multiple  SRU  failures. 
LRUs  are  assumed  to  fail  in  accordance  with  a  Poisson  process.  When  an  LRU  fails, 
a  random  number  is  drawn  for  each  SRU.  If  the  random  number  is  less  than  the 
SRU  probability  of  failure,  that  SRU  is  deemed  to  have  failed.  In  that  way  it  is 
possible for 0, 1, 2, ..., N SRUs to fail. In some cases, one or more of the SRUs in the
LRU  may  have  a  quantity  per  next  higher  assembly  (QPA)  greater  than  one.  Thus, 
more  than  one  unit  of  a  particular  SRU  may  fail. 
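
The failure draw described above might look as follows in Python. This is a sketch of one step only, not the report's simulation program; the function name and arguments are hypothetical, and each of the QPA units of an SRU is assumed to fail independently with the same probability.

```python
import random


def draw_failed_srus(fail_probs, qpa=None, rng=random):
    """Return, for one LRU failure, the number of failed units of each SRU.

    fail_probs[i] is the conditional failure probability of one unit of
    SRU i; qpa[i] is its quantity per next higher assembly (default 1).
    """
    if qpa is None:
        qpa = [1] * len(fail_probs)
    return [sum(1 for _ in range(n_i) if rng.random() < p_i)
            for p_i, n_i in zip(fail_probs, qpa)]


# Example: three SRUs, the second with a QPA of 2, using a seeded
# generator so that a run can be repeated.
failed = draw_failed_srus([0.5, 0.2, 1.0], qpa=[1, 2, 1],
                          rng=random.Random(1))
```

An SRU with probability one always fails and one with probability zero never does, so any number of failures from 0 to the total unit count can occur on a single LRU failure.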

Repair  times  can  be  exponential,  constant,  or  Erlang-k.  Simulations  were  run 
for  periods  of  40,000  to  400,000  days,  depending  on  demand  rates,  so  that  acceptably 
precise  95  percent  confidence  intervals  around  the  LRU  EBOs  could  be  obtained. 

Our  primary  interest  was  to  estimate  LRU  backorders  under  the  assumption  of 
full  cannibalization  of  all  SRUs  (i.e.,  consolidation  of  SRU  shortages  into  the  fewest 
possible  LRUs).  Although  full  cannibalization  is  not  practiced  unless  it  is  necessary 
to achieve an availability target, we believe it provides a useful target for
management concerning the highest performance level achievable.

The  simulation  was  also  run  under  the  assumption  of  no  cannibalization, 
where  we  assume  that  management  keeps  LRU  backorders  to  a  minimum  by 
replacing  SRUs  on  an  LRU  only  when  that  action  will  make  the  LRU  serviceable.  A 
repaired  SRU  that  would  not  restore  an  LRU  to  serviceable  condition  is  put  on  the 
shelf  until  it,  perhaps  in  combination  with  other  SRUs,  can  be  used  to  fix  an  LRU.  It 
is  easy  to  show  that  such  a  policy  is  better  than  replacing  SRUs  without  regard  to 
their  effect  on  LRU  condition.  What  is  interesting  is  that  for  many  combinations  of 
stock levels, the LRU backorders under this "opportunistic" policy are not
substantially greater than those under a full cannibalization policy. This is demonstrated in
the  examples  below. 

APPROXIMATE  MATHEMATICAL  MODEL 

We  consider  the  case  of  full  cannibalization  of  SRUs  belonging  to  a  single  LRU 
at  a  base.  When  LRU  disassembly  and  fault  isolation  are  completed,  those  SRUs 
that  have  failed  are  identified.  If  spare  SRUs  are  available  on  the  shelf  or  can  be 
cannibalized  from  LRUs  that  are  missing  other  SRUs,  the  LRU  is  returned  to  a 
serviceable  condition.  Otherwise  the  LRU  will  have  to  wait  until  spare  SRUs 
become  available  to  fill  all  of  its  holes. 

The  steady-state  probability  that  no  LRUs  are  waiting  because  of  SRU 
shortages is the product over the SRUs of the probabilities that s_i or fewer units of
SRU i are in base repair (where s_i is the base stock level for Item i); the steady-state
probability that y or fewer LRUs are waiting because of SRU shortages, Q(y), is the
product over the SRUs of the probabilities that s_i + a_i y or fewer units of SRU i are in
base repair:

$$Q(y) = \prod_i P_i(s_i + a_i y), \qquad y = 0, 1, 2, \ldots$$   [Eq. 2-1]

where P_i(s_i + a_i y) is the steady-state probability of s_i + a_i y or fewer units of the ith
SRU in repair, and a_i is the number of applications of SRU i in the LRU. When
demand is Poisson, these state probabilities are Poisson, independent of the shape of
the repair distribution.² However, Equation 2-1 is not strictly correct, because the
SRU  state  probabilities  are  not  independent  of  each  other.  This  is  because  of  the 
assumption  that  when  an  SRU  failure  is  detected,  there  may  be  other  SRU  failures 
detected  simultaneously. 

The  probability  of  exactly  y  LRUs  waiting  because  of  SRU  shortages,  denoted 
by S(y), is the probability of y or fewer minus the probability of y - 1 or fewer:

$$S(y) = Q(y) - Q(y-1), \qquad y = 1, 2, 3, \ldots$$   [Eq. 2-2]

$$S(0) = Q(0)$$
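
Equations 2-1 and 2-2 translate directly into code. The Python sketch below assumes Poisson state probabilities for every SRU (that is, a QPA of one, or an acceptance of the Poisson approximation) and treats the SRUs as independent, as the equations do. The function and argument names are hypothetical.

```python
import math


def poisson_cdf(k: int, mean: float) -> float:
    """P[X <= k] for X ~ Poisson(mean); zero for negative k."""
    if k < 0:
        return 0.0
    term = math.exp(-mean)
    total = term
    for x in range(1, k + 1):
        term *= mean / x
        total += term
    return min(total, 1.0)


def q_waiting(y, pipelines, stocks, apps):
    """Eq. 2-1: probability that y or fewer LRUs wait for SRUs."""
    prob = 1.0
    for mean, s_i, a_i in zip(pipelines, stocks, apps):
        prob *= poisson_cdf(s_i + a_i * y, mean)
    return prob


def s_waiting(y, pipelines, stocks, apps):
    """Eq. 2-2: probability that exactly y LRUs wait for SRUs."""
    if y == 0:
        return q_waiting(0, pipelines, stocks, apps)
    return (q_waiting(y, pipelines, stocks, apps)
            - q_waiting(y - 1, pipelines, stocks, apps))
```

With a single SRU, zero stock, and one application, S(y) reduces to the Poisson probability of exactly y units in repair, which is a convenient check on an implementation.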

The  final  objective  is  to  compute  the  probability  distribution  that  there  are  y 
LRU  backorders  because  of  either  LRUs  in  disassembly  and  fault  isolation  on  the 
one  hand  or  LRUs  whose  repair  is  delayed  by  SRU  shortages  on  the  other  hand.  The 
former are Poisson probabilities (denoted by L) with mean mT_0, where m is the
demand rate and T_0 is the mean time for LRU fault isolation and reassembly after SRU
repair.  This  follows  from  Palm’s  theorem,  and  the  observation  that  the  two  parts  of 
the  Poisson  process  (LRU  checkout  and  SRU  repair)  are  independent  since  they  are 
displaced  in  time. 


²For any SRU with a QPA greater than one, the demand process is compound Poisson. In
such a process, demands appear in clusters. If an individual unit of the SRU has a failure
probability p and the QPA is n, the compounding distribution for the size of the cluster is binomial
with mean np and variance np(1-p). The extended form of Palm’s theorem shows that the state
probabilities are compound Poisson if the same repair time is drawn for each failed unit of the SRU
cluster, a somewhat unrealistic assumption.

Even though the compounding distribution is simple, the state probabilities for the number
of demands in an interval of time are not. Many references show that the mean of the compound
Poisson process for the SRU is the Poisson mean multiplied by np and the variance-to-mean is the
second moment of the compounding distribution divided by the first moment, or 1 + (n-1)p. As in
VARI-METRIC [3], we have adopted the simplest procedure for approximating the state
probabilities by using these two parameters to define a negative binomial distribution.
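
As an illustration of the two-parameter fit mentioned in the footnote, the Python sketch below matches a negative binomial distribution to a given mean and variance-to-mean ratio. The parameterization shown is one standard moment-matching choice and is an assumption of this example; the report does not spell out its own formulas, and the function names are hypothetical.

```python
import math


def compound_poisson_moments(poisson_mean: float, qpa: int, p: float):
    """Mean and variance-to-mean ratio stated in the footnote:
    mean = (Poisson mean) * n * p and ratio = 1 + (n - 1) * p."""
    return poisson_mean * qpa * p, 1.0 + (qpa - 1) * p


def negative_binomial_pmf(x: int, mean: float, vmr: float) -> float:
    """P[X = x] for a negative binomial with the given mean and
    variance-to-mean ratio (requires vmr > 1).

    Moment matching gives r = mean / (vmr - 1) and success
    probability q = 1 / vmr.
    """
    if vmr <= 1.0:
        raise ValueError("variance-to-mean ratio must exceed one")
    r = mean / (vmr - 1.0)
    q = 1.0 / vmr
    log_pmf = (math.lgamma(x + r) - math.lgamma(r) - math.lgamma(x + 1)
               + r * math.log(q) + x * math.log(1.0 - q))
    return math.exp(log_pmf)
```

When the QPA is one the ratio equals one and the Poisson distribution should be used instead, which is why the function rejects that case.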
If the LRU stock level is s_0, the number of backordered LRUs resulting from
LRU disassembly and fault isolation, L(y), or from SRU backorders is obtained by
convolution:

$$B(y) = \sum_{x=0}^{y+s_0} L(x)\, S(y + s_0 - x), \qquad y = 1, 2, 3, \ldots$$   [Eq. 2-3]

and the LRU EBOs are:

$$E[B(y)] = \sum_y y\, B(y).$$   [Eq. 2-4]

As noted above, Equation 2-4 does not give an exact solution to our problem
because the P_i's in Equation 2-1, although Poisson because of Palm’s theorem, are not
independent. If independence is assumed for computational purposes, the EBOs will
be overstated. Thus, this procedure will produce an upper bound for the true
solution.
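
The convolution and the expected-value sum can be sketched in Python as follows. The sketch assumes that S is supplied as a list truncated where its tail is negligible and that L is Poisson with a given mean. The function names are hypothetical, and the entry for zero backorders (all totals at or below the LRU stock level) is added here only so that the returned list is a complete distribution.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """P[X = k] for X ~ Poisson(mean)."""
    if mean == 0.0:
        return 1.0 if k == 0 else 0.0
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


def lru_backorder_distribution(s_dist, checkout_mean, lru_stock):
    """Return [B(0), B(1), ...] where B(y) for y >= 1 is the convolution
    sum over x of L(x) * S(y + s_0 - x)."""
    l_dist = [poisson_pmf(x, checkout_mean) for x in range(len(s_dist))]
    # Distribution of the total number of LRUs in checkout or waiting.
    total = [0.0] * (2 * len(s_dist) - 1)
    for x, l_x in enumerate(l_dist):
        for w, s_w in enumerate(s_dist):
            total[x + w] += l_x * s_w
    b = [sum(total[:lru_stock + 1])]      # no backorders
    b.extend(total[lru_stock + 1:])       # y = 1, 2, 3, ...
    return b


def expected_lru_backorders(b):
    """Expected backorders: sum over y of y * B(y)."""
    return sum(y * b_y for y, b_y in enumerate(b))
```

A useful check is the case in which S is itself Poisson: the total is then Poisson with the sum of the two means, so with zero LRU stock the expected backorders equal that sum.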


We  can  compute  a  lower  bound  as  well  by  using  just  one  SRU  (since  the 
backorders  with  many  SRUs  must  be  at  least  as  large).  We  have  found  that  when 
failure  probabilities  or  repair  times  vary  from  one  SRU  to  another,  using  the  average 
pipeline  value  (i.e.,  LRU  demand  rate  times  the  average  conditional  probability  of 
failure  times  the  average  SRU  repair  time)  is  appropriate,  though  not  a  strict  lower 
bound. 

In  the  next  section  (Results),  we  show  that  the  difference  between  the  lower  and 
upper  bound  increases  as  the  number  of  SRUs  increases  and  as  the  sum  of  the  SRU 
failure probabilities, PSUM, increases. When PSUM is 10, the simulated LRU EBOs
under cannibalization, CAN, tend to be about 55 percent of the distance from the
lower  bound  to  the  upper  bound.  As  PSUM  decreases,  CAN  is  a  larger  percentage  of 
the  distance  between  the  bounds.  Using  regression,  we  found  that  the  best  fit  was 
obtained  with: 


$$F = 0.812 - 0.114 \log(\mathrm{PSUM})$$   [Eq. 2-5]

where  F  is  the  fraction  of  the  difference  between  the  lower  and  upper  bound.  Since 
the  true  solution  always  lies  in  the  interval,  F  should  be  constrained  between  0  and  1 
(we  actually  constrained  it  between  0.2  and  0.8).  With  120  data  points  the  statistical 
fit  was  extremely  good  and  the  coefficient  values  highly  significant. 
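
The interpolation step can be sketched in Python as follows. The sketch assumes that Log in Equation 2-5 is the natural logarithm; that reading reproduces the value of about 55 percent quoted above for a PSUM of 10, whereas a base-10 logarithm would give about 70 percent. The function names are hypothetical, and the limits of 0.2 and 0.8 are the constraints mentioned in the text.

```python
import math


def interpolation_fraction(psum: float, lo: float = 0.2,
                           hi: float = 0.8) -> float:
    """Eq. 2-5 with a natural logarithm (an assumption), constrained
    to the interval [lo, hi]."""
    f = 0.812 - 0.114 * math.log(psum)
    return max(lo, min(hi, f))


def estimate_ebo(alow: float, aup: float, psum: float) -> float:
    """Estimate placed the fraction F of the way from the lower bound
    to the upper bound."""
    f = interpolation_fraction(psum)
    return alow + f * (aup - alow)
```

For PSUM values of one or less the unconstrained formula gives 0.812 or more, so the constraint sets F to 0.8 in those cases.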
We made one alteration in the computation of upper bounds. When there are
several  SRUs  and  each  has  a  conditional  probability  of  failure  equal  to  one,  the 
upper  bound  is  not  very  close  to  the  solution,  CAN.  This  is  because  the  upper  bound 
assumes  independence,  though  in  fact  there  is  total  dependence  between  the  SRUs. 
Instead  of  computing  the  upper  bound  using  all  N  SRUs,  a  better  upper  bound  is 
obtained  by  using  only  N/2.  This  is  because  of  our  use  of  the  Erlang-4  distribution 
for  repair  times,  which,  as  noted  in  Table  2-1,  gives  waiting  times  that  are  about  half 
way  between  the  constant  (single  SRU)  and  exponential  (N  SRUs)  cases. 

RESULTS 

The  numerical  results  of  applying  the  approximation  techniques  described  are 
shown  in  Table  2-2.  The  headings  for  each  group  of  cases  give  the  number  of  SRUs, 
the  daily  demand  rate  (DDR)  for  the  LRU,  and  the  regression  value  F  expressed  as  a 
percent.  Next  we  show  the  conditional  failure  probability  for  each  SRU  (i.e.,  given 
that  the  LRU  has  just  failed,  the  probability  that  a  particular  SRU  has  failed),  and 
the  average  repair  time  for  the  LRU  and  each  SRU.  The  repair  distribution  for  each 
item  is  assumed  to  be  Erlang-4.  PSUM,  the  sum  of  the  SRU  failure  probabilities,  is 
also  shown  because  it  is  the  independent  variable  in  the  regression  adjustment. 

Each  line  in  the  table  provides  the  results  for  one  case: 

•  Case #: Case number for reference.

•  ALOW: Analytic lower bound for the LRU EBOs from the model.

•  AUP: Analytic upper bound for the LRU EBOs from the model.

•  EST: Estimated solution for the LRU EBOs obtained by using the
regression formula for the fraction of the distance between ALOW and AUP.

•  CAN: Simulation result for the LRU EBOs under cannibalization.

•  %ERR: The percent error, which is 100(EST - CAN)/CAN.

•  DELTA: The delta value to be added to and subtracted from CAN to obtain the
95 percent confidence interval for the LRU EBOs from the simulation. If
EST falls outside the confidence limits for CAN, an asterisk is placed after
the %ERR to indicate a statistically significant difference.

•  MLOW:  The  METRIC  lower  bound  estimate  of  LRU  EBOs.  In  those  cases 
for  which  PSUM  is  more  than  one,  the  SRU  failure  probabilities  are 
normalized  so  that  they  add  to  one. 




•  MUP: The METRIC upper bound estimate of LRU EBOs, without
normalizing the SRU failure probabilities. Note that when PSUM is one or less,
MUP = MLOW.

•  NOCAN: Simulation result for the LRU EBOs with no cannibalization, but
with  the  opportunistic  replacement  policy  described  in  the  Simulation 
section.  Note  that  these  values  equal  or  exceed  CAN. 

•  Stock  Levels:  Stock  levels  for  the  LRU  and  each  SRU. 

The cases in Table 2-2 are in ascending order of PSUM, the sum of the SRU failure
probabilities, except that the final cases, 116 through 120, have multiple
applications (QPA) of some SRUs in the LRU.

Before  discussing  the  results,  some  general  comments  on  the  evaluation 
procedures  are  in  order.  Our  primary  interest  is  the  average  of  the  absolute  values  of 
the  percent  errors  over  the  cases.  This  is  because  an  error  of  0.1  backorders  is  more 
important  if  the  true  value  is  0.2  than  if  the  true  value  is  20. 

The  METRIC  estimate  was  obtained  by  averaging  the  upper-  and  lower-bound 
estimates.  This  was  the  best  METRIC  procedure  we  could  find,  although  it  is  clearly 
not  very  good  for  several  reasons: 

•  When  the  SRU  failure  probabilities  sum  to  one  or  less,  the  lower  and  upper 
bounds  are  identical.  Thus,  they  do  not  bound  the  solution. 

•  When  the  SRU  failure  probabilities  sum  to  more  than  one  so  that  the 
bounds  are  different,  they  are  often  very  far  apart.  Even  though  these 
bounds  are  far  apart,  they  fail  to  include  the  solution  in  10  cases. 

Though  it  may  be  possible  to  obtain  a  better  estimate  from  METRIC,  we  did  not 
attempt  to  do  so  because  the  recommended  procedure  is  so  much  better.  In  that 
sense,  METRIC  is  used  only  as  a  strawman  for  comparison  purposes.  That  is  why  we 
show  only  the  average  METRIC  error  rather  than  the  METRIC  error  for  each 
individual  case.  The  METRIC  error  is  relevant,  though,  because  METRIC  is  used  to 
model  applications  now. 

The  cases  in  Table  2-2  were  chosen  to  provide  an  interesting  range  of  situations 
with  1  to  20  SRUs,  PSUMs  from  0.1  to  10,  demand  rates  of  0.1/day  and  1/day,  and 
SRU  repair  times  of  5  to  20  days.  The  average  absolute  percentage  error  depends  on 
the  mix  of  cases  chosen  and  should  only  be  considered  as  an  indication  of  the 
adequacy  of  our  approach. 
TABLE 2-2

SIMULTANEOUS DETECTION CASES

[Table body not recoverable from the source. Columns: Case #, ALOW, AUP, EST, CAN, %ERR, DELTA, MLOW, MUP, NOCAN, Stock Levels.]


DISCUSSION  OF  RESULTS 


The average absolute percent error from our technique, 3.57 percent,
compares very favorably with the 306 percent error from METRIC. Our maximum
error of -17.86 percent is a good deal larger than our 3.57 percent average.
However,  the  METRIC  error  exceeds  100  percent  in  40  of  the  120  cases  and  was  less 
than  ours  in  only  7  cases. 

One  problem  with  average  absolute  percent  error  is  that  a  few  very  large  errors 
can  distort  the  conclusion.  Since  the  true  value  (CAN)  is  in  the  denominator  of 
percent  error,  a  few  large  percent  errors  can  easily  occur,  especially  when  CAN  is 
small.  An  alternative  assessment  using  percent  error  can  be  obtained  by  summing 
the  absolute  errors  and  then  dividing  by  the  sum  of  the  true  values.  This  average 
error  must  be  smaller,  but  the  important  question  is  how  our  error  and  the  METRIC 
error  compare.  These  errors  of  2.31  percent  and  93.43  percent,  respectively,  still 
show  a  ratio  of  approximately  1  to  40. 
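
The two summary measures discussed above differ only in where the averaging takes place, as the following Python sketch shows. The function names are hypothetical, and the two-case data in the example are made up for illustration; they are not taken from Table 2-2.

```python
def average_absolute_percent_error(estimates, true_values):
    """Mean over the cases of |100 * (EST - CAN) / CAN|."""
    errors = [abs(100.0 * (e - t) / t)
              for e, t in zip(estimates, true_values)]
    return sum(errors) / len(errors)


def aggregate_percent_error(estimates, true_values):
    """100 * (sum of absolute errors) / (sum of true values)."""
    abs_errors = [abs(e - t) for e, t in zip(estimates, true_values)]
    return 100.0 * sum(abs_errors) / sum(true_values)


# Made-up example: the same absolute error of 0.1 on a small and a
# large true value.
est = [0.3, 20.1]
can = [0.2, 20.0]
per_case = average_absolute_percent_error(est, can)   # about 25.25
pooled = aggregate_percent_error(est, can)            # about 0.99
```

In this example the small case dominates the per-case average, while the pooled measure weights each case by its true value.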

We  also  observe: 

•  Our  errors  are  smallest  for  cases  in  which  PSUM  is  less  than  or  equal  to  one. 
This  is  true  for  METRIC  as  well,  but  as  demand  rates  increase  (e.g., 
Cases  21  and  37)  the  METRIC  error  also  becomes  very  large. 

• The error in 56 of our 120 cases falls inside the 95 percent confidence
interval from the simulation.

• Our largest percentage errors, -17.86 for Case 100 and -16.16 for
Case 112, correspond to small backorder errors of 0.009 and 0.005, respectively.

• When all SRU failure probabilities are one (Cases 88 through 100), the
expected backorders without cannibalization, NOCAN, are identical with
CAN. This is true for any repair distribution, and is due to the
opportunistic replacement policy under NOCAN.

•  In  the  other  cases,  NOCAN  is  usually  only  slightly  larger  than  CAN.  There 
are  notable  exceptions,  however.  In  Case  37  NOCAN  is  86  percent  larger 
than  CAN,  in  Case  104  NOCAN  is  almost  twice  as  large,  and  in  Case  112 
NOCAN  is  more  than  twice  as  large.  The  similarity  among  these  cases  is 
the  high  LRU  stock  level  and  low  SRU  stock  levels.  The  analytic  upper 
bounds  are  below  NOCAN  in  the  first  two  of  these  cases.  It  is  because  of 
cases  such  as  these  and  the  fact  that  our  analytic  model  is  based  on  the 
cannibalization assumption that we chose to focus on the full
cannibalization case in this study. Note that in each of the 120 cases CAN lies
between the bounds of ALOW and AUP.

We have assumed Erlang-4 repair distributions for the SRUs. It may be of
interest to assess the sensitivity of backorders to that assumption. The
solution for constant SRU repair times lies between the lower analytic
bound and CAN; that for exponential repair times lies between CAN and
the upper analytic bound. For example, for Cases 88 through 90 the
constant solutions are identical with the lower bound (because each SRU
fails every time), and the exponential solutions are 1.841, 1.803, and 1.803,
respectively.


CHAPTER 3


SEQUENTIAL  FAILURE  DETECTION 

MATHEMATICAL  MODEL 

As  noted  in  the  Introduction,  we  assumed  in  the  sequential  failure  detection 
case  that  after  some  LRU  checkout  time,  a  diagnosis  is  made  of  the  first  failed  SRU. 
If  a  spare  SRU  is  available,  it  is  installed  on  the  LRU  and  the  testing  continues  to 
find  the  next  (if  any)  failed  SRU.  However,  if  a  spare  is  not  available  for  the  failed 
SRU,  the  diagnosis  of  the  next  failed  SRU  is  delayed. 

Again, we assume Erlang-4 repair times. In this sequential detection case, the
mean and variance for the number of LRUs in repair are given by the VARI-METRIC
approximations [3]:

    E(x_0) = \lambda_0 R_0 + \sum_{j=1}^{N} E[B(s_j)]          [Eq. 3-1]

    Var(x_0) = \lambda_0 R_0 + \sum_{j=1}^{N} Var[B(s_j)]      [Eq. 3-2]

where \lambda_0 is the LRU demand rate, assumed to be Poisson, and R_0 is the average LRU
checkout time. Since any SRU with a backorder delays an LRU in this sequential
case, it is appropriate to add the expected SRU backorders to obtain the total number
of LRUs in repair.
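
As an illustration of how Equations 3-1 and 3-2 might be evaluated, the sketch below assumes that the number of each SRU in repair is Poisson with a known pipeline mean. The function names, the truncation limit n_max, and all numeric inputs are hypothetical and are not taken from the report.

```python
import math


def poisson_pmf(n_max, mean):
    """Poisson probabilities p(0..n_max | mean), built recursively."""
    probs = [math.exp(-mean)]
    for n in range(1, n_max + 1):
        probs.append(probs[-1] * mean / n)
    return probs


def backorder_moments(pipeline_mean, stock, n_max=200):
    """E[B(s)] and Var[B(s)] for a Poisson number in repair and stock level s."""
    p = poisson_pmf(n_max, pipeline_mean)
    eb = sum((x - stock) * p[x] for x in range(stock + 1, n_max + 1))
    eb2 = sum((x - stock) ** 2 * p[x] for x in range(stock + 1, n_max + 1))
    return eb, eb2 - eb ** 2


def lru_repair_moments(lru_demand_rate, checkout_time, sru_pipelines, sru_stocks):
    """Mean and variance of the number of LRUs in repair (Eqs. 3-1 and 3-2)."""
    mean = lru_demand_rate * checkout_time
    var = lru_demand_rate * checkout_time
    for mu, s in zip(sru_pipelines, sru_stocks):
        eb, vb = backorder_moments(mu, s)
        mean += eb
        var += vb
    return mean, var


# Hypothetical inputs: two SRUs, first with zero stock and then with one spare each.
mean_zero, var_zero = lru_repair_moments(2.0, 0.5, [1.0, 1.5], [0, 0])
mean_pos, var_pos = lru_repair_moments(2.0, 0.5, [1.0, 1.5], [1, 1])
print(mean_zero, var_zero, mean_pos, var_pos)
```

With zero SRU stock every unit in repair is a backorder, so each SRU term reduces to its pipeline mean and the variance equals the mean.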

However,  the  simulation  output  shows  that  Equations  3-1  and  3-2  tend  to 
overstate  the  number  of  LRUs  in  repair  when  the  SRU  stock  levels  are  positive, 
particularly  when  there  are  many  SRU  failures  (e.g.,  10  or  so).  The  simulation 
shows  that  in  cases  with  several  SRUs  in  sequence  and  positive  SRU  stock  levels,  the 
probability  distribution  of  the  number  in  repair  for  the  last  SRUs  in  sequence  is  no 
longer Poisson. A degree of regularity has been imposed on the demand process such
that the variance-to-mean ratios have dropped below the Poisson value of one to
values in the 0.8 to 0.9 range. The positive SRU stocks act as a buffer in the demand
process for the last SRUs in the sequence.

In the Appendix, we demonstrate procedures for calculating Erlang state
probabilities for any specified variance-to-mean ratio less than one. If we knew the
appropriate variance-to-mean ratio for each SRU, we could calculate the expected
backorders and the variance more accurately (i.e., the last term in Equations 3-1 and
3-2, respectively). The problem is that the variance-to-mean ratio for a particular
SRU is hard to predict; it depends on the stock levels of the previous SRUs in a
rather complicated way.³

However,  Equations  3-1  and  3-2  do  give  a  good  upper  bound.  We  have  found 
empirically  that  a  reasonable  lower  bound  is  obtained  by  using  Equations  3-1  and 
3-2  with  half  of  the  SRUs.  In  cases  in  which  the  SRUs  have  different  repair  times, 
failure  probabilities,  or  stock  levels,  we  suggest  that  the  SRUs  for  the  lower  bound  be 
those with the largest SRU EBOs. That procedure was followed in the cases shown
in Table 3-1.

As in the simultaneous failure detection cases, the difference between the
lower and upper bound increases as the number of SRUs increases and as the sum of
the SRU conditional probabilities, PSUM, increases. When PSUM is 10, the LRU
EBOs tend to be about 67 percent of the distance from the lower bound to the upper
bound. As PSUM decreases, the LRU EBOs are a larger percentage of the distance
between the bounds. Using regression, we found that the best fit was obtained with:

    F = 1.126 - 0.196 \ln(PSUM)          [Eq. 3-3]

where F is the fraction of the distance from the lower bound to the upper bound at
which the estimate is placed. The logarithm is evidently natural: at PSUM = 10 the
formula gives F = 0.67, matching the 67 percent noted above.
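
The interpolation can be sketched as below. The sketch assumes a natural logarithm, since that reproduces the 67 percent figure quoted for PSUM = 10, and it assumes the estimate is formed as the lower bound plus F times the gap between the bounds. The function names and the sample bounds are hypothetical.

```python
import math


def interpolation_fraction(psum):
    """Regression fraction F = 1.126 - 0.196 ln(PSUM) (natural log assumed)."""
    return 1.126 - 0.196 * math.log(psum)


def estimate_lru_ebo(lower_bound, upper_bound, psum):
    """Place the estimate a fraction F of the way from the lower to the upper bound."""
    f = interpolation_fraction(psum)
    return lower_bound + f * (upper_bound - lower_bound)


f_at_10 = interpolation_fraction(10.0)
est = estimate_lru_ebo(1.0, 2.0, 10.0)  # hypothetical bounds
print(f_at_10, est)
```

When the two bounds coincide, the estimate equals the common bound whatever the value of F.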

RESULTS 

The  numerical  results  of  applying  the  approximation  techniques  described  are 
shown  in  Table  3-1.  The  headings  for  each  group  of  cases  are  identical  with  those  in 
Table  2-2.  The  cases  in  this  table  were  selected  from  the  set  of  120  cases  in  Table  2-2. 
We have used a much smaller number of cases here, concentrating on those with
large values of PSUM, where the differences are greatest.


³ The LRU EBOs depend on the order in which the SRU failures are detected. An SRU with a
small stock level will delay more LRUs if it is one of the first in the detection sequence.

Each line in the table provides the results for one case:

•  Case  #:  Case  number  for  reference;  agrees  with  those  in  Table  2-2. 

•  VLOW:  VARI-METRIC  lower  bound  analytic  solution  for  the  LRU  EBOs. 

• VUP: VARI-METRIC upper bound analytic solution for the LRU EBOs.

• EST: Estimated solution for the LRU EBOs obtained by using the regression
formula for the fraction of the distance between VLOW and VUP.

• SIM: Simulation result for the LRU EBOs under sequential SRU failure
detection.

• % ERR: The percent error, which is 100(EST - SIM)/SIM.

• DELTA: The delta value to be added to and subtracted from SIM to obtain the
95 percent confidence interval for the LRU EBOs from the simulation. If
EST falls outside the confidence limits for SIM, an asterisk is placed after
the % ERR to indicate a statistically significant difference.

•  Stock  Levels:  Stock  levels  for  the  LRU  and  each  SRU. 
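
The % ERR and DELTA columns defined above can be computed as in the following sketch. The function names and the sample values are hypothetical; only the formula 100(EST - SIM)/SIM and the asterisk rule come from the column definitions.

```python
def percent_error(est, sim):
    """% ERR as defined for the table: 100(EST - SIM)/SIM."""
    return 100.0 * (est - sim) / sim


def outside_confidence_interval(est, sim, delta):
    """True when EST lies outside SIM +/- DELTA."""
    return abs(est - sim) > delta


def format_pct_err(est, sim, delta):
    """Percent error to two decimals, with an asterisk if statistically significant."""
    mark = "*" if outside_confidence_interval(est, sim, delta) else ""
    return f"{percent_error(est, sim):.2f}{mark}"


# Hypothetical values, not rows from Table 3-1.
flagged = format_pct_err(1.10, 1.00, 0.05)
unflagged = format_pct_err(1.02, 1.00, 0.05)
print(flagged, unflagged)
```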

The  cases  in  Table  3-1  are  presented  in  ascending  order  of  PSUM,  the  sum  of 
the  SRU  failure  probabilities,  except  that  the  final  cases,  116  through  120,  have 
multiple  applications  (QPA)  of  some  SRUs  in  the  LRU. 

By  way  of  comparison  with  Table  2-2  for  simultaneous  detection,  we  should 
note  that  there  is  nothing  like  cannibalization  in  this  table.  If  cannibalization  were 
possible,  the  situation  would  be  similar  to  simultaneous  detection.  Also  the  values  of 
VLOW  and  VUP  are  similar  to,  but  slightly  larger  than,  MLOW  and  MUP  in 
Table 2-2. The VARI-METRIC lower bound is set equal to the upper bound
whenever PSUM is one or less and whenever all SRU stock levels are zero.

DISCUSSION  OF  RESULTS 

The average absolute percent error from our technique, 6.30 percent, is
much larger than the 3.57 percent obtained in Table 2-2. This is due in part to our
concentration  on  cases  with  large  values  of  PSUM  and  QPAs  greater  than  one. 
However,  the  maximum  percent  errors  are  larger  as  well. 

The comparable VARI-METRIC error, using the upper bound, is 9.62 percent,
which is only about 50 percent larger than ours. However, the maximum error of
57.2 percent is reduced to 32.9 percent with our technique.
SEQUENTIAL DETECTION CASES

The following observations are also made:

•  The  error  in  19  of  our  35  cases  falls  inside  the  95  percent  confidence  interval 
from  the  simulation. 

•  A  number  of  our  larger  errors  are  in  Cases  116  through  120,  those  with 
QPAs  greater  than  one.  Those  cases  are  particularly  complex  because  the 
SRU  demand  is  compound  Poisson  with  variance-to-mean  ratios  that 
decrease  from  one  SRU  to  the  next  in  the  detection  sequence. 

•  When  the  SRU  stock  levels  are  zero,  the  LRU  EBOs  are  independent  of  the 
SRU  repair  distribution  shape  (Palm’s  theorem  holds).  When  the  SRU  stock 
levels  are  positive,  the  LRU  EBOs  are  largest  when  the  SRU  repair 
distributions are exponential and smallest when they are constant. For
example, Case 97 has a large error and LRU EBOs of 0.507 under Erlang-4
repair. The backorders are 0.690 under exponential and 0.211 under
constant repair. In the latter case, the variance-to-mean ratio for the
number in repair of the last SRU is only 0.66.

• The simulation solution is sometimes below the lower bound (e.g.,
Cases 105, 111, and 118); in the cases with QPAs greater than one, the
simulation solution is sometimes above the upper bound (e.g., Cases 116
and 117).

•  A  comparison  of  Tables  2-2  and  3-1  shows  the  expected  backorders  are 
always  greater  in  the  latter.  The  difference  is  smallest  for  low  demand 
rates,  ample  stock,  and  small  values  of  PSUM.  The  largest  percentage 
difference  is  in  Case  112  with  values  of  0.028  and  7.077  backorders  in 
Tables  2-2  and  3-1,  respectively.  Other  large  differences  are  Cases  91, 101, 
and  117  for  which  Table  3-1  has  values  about  five  times  as  large  as  those  in 
Table  2-2. 
CHAPTER 4


CONCLUSIONS 


We  have  demonstrated  that  the  LRU  EBOs  are  much  larger  in  the  sequential 
SRU  failure  detection  case  than  in  simultaneous  detection.  Obviously  it  is  desirable 
for maintenance to strive for the latter type of detection process whenever possible.
Simple  approximate  computational  formulas  that  have  been  developed  for  both  types 
of  detection  appear  to  give  reasonably  accurate  results. 

If  a  model  such  as  VARI-METRIC  is  used  without  the  suggested  modifications, 
there  are  several  implications.  In  the  simultaneous  detection  cases,  the  LRU  EBOs 
and  the  spares  requirements  will  be  overstated,  dramatically  in  many  cases.  Thus, 
we  would  tend  to  buy  too  much  stock  although  the  proportion  of  budget  spent  on 
SRUs  will  tend  to  be  about  right. 

In  the  sequential  detection  cases,  the  LRU  EBOs  and  the  spares  requirements 
will  be  overstated  but  by  a  smaller  amount.  However,  because  VARI-METRIC  gives 
the  correct  answers  for  zero  SRU  stock  levels  and  overestimates  backorders  when 
SRUs  are  in  stock,  it  underestimates  the  value  of  SRU  stock.  Thus,  the  proportion  of 
budget  spent  on  SRUs  will  tend  to  be  smaller  than  optimal. 
APPENDIX


COMPUTATION  OF  ERLANG  STATE  PROBABILITIES 


Since Erlang-4 repair distributions have been used in our modeling, we wanted
to be able to compute Erlang state probabilities, i.e., the probability distribution for
the number of events in any fixed time period where the time between events has an
Erlang distribution. Those probabilities are necessary to model the sequential failure
detection process more accurately. They are useful in other modeling applications
as well, and we have not seen these methods described elsewhere.

We  begin  with  the  Erlang-4  and  then  extend  the  results  to  the  general  case. 
Consider  a  Poisson  process  and  multiply  the  mean  by  four.  The  time  between 
successive  events  is  exponentially  distributed,  of  course,  and  the  time  between  every 
set  of  four  events  is  Erlang-4.  Our  objective  is  to  compute  the  state  probabilities  for 
these  latter  Erlang  events. 

To  compute  the  state  probabilities  we  must  relate  the  Poisson  probabilities  for 
the  original  process  to  those  of  the  Erlang.  For  example,  suppose  that  we  observe  no 
events  in  the  Poisson  process;  then  there  were  no  Erlang  events.  If  we  observe  four 
Poisson  process  events,  there  must  have  been  one  Erlang  event  since  every  fourth 
event  is  Erlang. 

The  problem  arises  when  we  observe  a  number  of  Poisson  events  not  precisely 
divisible  by  four.  For  example,  if  we  observe  one  Poisson  event,  it  may  or  may  not  be 
an  Erlang  event,  depending  on  our  counting  origin.  With  a  random  origin,  there  is  a 
probability  1/4  that  it  is  an  Erlang  event.  The  general  relationship  for  the  Erlang-4 
state  probabilities  with  mean,  M,  e(i|M),  is: 

    e(0|M) = p(0|4M) + 0.75 p(1|4M) + 0.5 p(2|4M) + 0.25 p(3|4M)          [Eq. A-1]

    e(i|M) = 0.25 p(4i-3|4M) + 0.5 p(4i-2|4M) + 0.75 p(4i-1|4M)
             + p(4i|4M) + 0.75 p(4i+1|4M) + 0.5 p(4i+2|4M)
             + 0.25 p(4i+3|4M),    i = 1, 2, \ldots                        [Eq. A-2]

where the p’s are Poisson probabilities with a mean of 4M. It is easy to verify that
the e’s sum to one, and thus comprise a valid probability distribution. It is also easy
to  check  that  the  mean  of  the  e’s  is  M  (when  each  equation  for  e(i)  is  multiplied  by  i 
and  summed,  the  coefficient  of  each  Poisson  probability,  p(n),  is  n/4).  The  Erlang 
variance  is  a  simple  function  of  the  p’s  also,  and  it  will  always  be  less  than  the  mean. 
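
The Erlang-4 relationship above, with coefficients 0.25, 0.5, 0.75, 1, 0.75, 0.5, 0.25 on the Poisson probabilities around p(4i|4M), can be checked numerically with a short sketch. The function names, the truncation limit i_max, and the choice M = 2 are hypothetical.

```python
import math


def poisson_pmf(n_max, mean):
    """Poisson probabilities p(0..n_max | mean), built recursively."""
    probs = [math.exp(-mean)]
    for n in range(1, n_max + 1):
        probs.append(probs[-1] * mean / n)
    return probs


def erlang4_state_probs(m, i_max=60):
    """Erlang-4 state probabilities e(0..i_max | m) from Poisson(4m) probabilities."""
    p = poisson_pmf(4 * i_max + 3, 4.0 * m)
    coeffs = {-3: 0.25, -2: 0.5, -1: 0.75, 0: 1.0, 1: 0.75, 2: 0.5, 3: 0.25}
    e = []
    for i in range(i_max + 1):
        total = 0.0
        for offset, c in coeffs.items():
            n = 4 * i + offset
            if n >= 0:  # e(0) has no terms below p(0)
                total += c * p[n]
        e.append(total)
    return e


e = erlang4_state_probs(2.0)
total_prob = sum(e)
mean = sum(i * ei for i, ei in enumerate(e))
variance = sum(i * i * ei for i, ei in enumerate(e)) - mean ** 2
print(total_prob, mean, variance / mean)
```

The printed values let one confirm the three properties stated in the text: the probabilities sum to one, the mean is M, and the variance is below the mean.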

The  result  is  a  simple  analytic  computation  for  Erlang  state  probabilities  from 
Poisson  probabilities.  While  the  physical  model  for  the  Erlang-k  is  based  on  the  kth 
exponential  event  where  k  is  integral,  the  state  probabilities  can  be  calculated  for 
nonintegral  k  as  well.  This  allows  us  to  model  any  variance-to-mean  ratio  less  than 
one. The generalized version of Equations A-1 and A-2 for nonintegral k can be
written, but the expression for the coefficients is very complicated. For computational
purposes, it is more useful to provide the equations for the first three Erlang
probabilities, where K is used to denote [k], the integer less than or equal to k:

    e(0|M) = p(0|kM) + p(1|kM)(k-1)/k + p(2|kM)(k-2)/k
             + \cdots + p(K|kM)(k-K)/k                                     [Eq. A-3]

    e(1|M) = p(1|kM)(1/k) + p(2|kM)(2/k) + \cdots + p(K|kM)(K/k)
             + p(K+1|kM)(2k-K-1)/k + \cdots + p(2K|kM)(2k-2K)/k            [Eq. A-4]

    e(2|M) = p(K+1|kM)(1-k+K)/k + \cdots + p(2K|kM)(2K-k)/k
             + p(2K+1|kM)(3k-2K-1)/k + \cdots + p(3K|kM)(3k-3K)/k          [Eq. A-5]


The  general  pattern  can  be  inferred  easily,  noting  that  the  numerators  of  the 
successive  coefficients  in  an  equation  increase  by  1  to  the  value  K/k  and  then 
decrease  by  1  (modulo  k).  Any  probability  p(x)  is  in  one  equation  or  two  equations, 
depending  on  whether  k  is  integral,  and  the  sum  of  its  coefficients  is  1.  Thus,  the  e’s 
are  a  probability  distribution  since  the  p’s  are  a  probability  distribution.  When  the 
Erlang  probabilities  e(i)  in  Equations  A-3  through  A-5  are  multiplied  by  i  and 
summed,  each  Poisson  probability,  p(x),  has  a  coefficient  x/k  showing  that  the  mean 
of  the  Erlang  is  M. 
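
One way to read the general pattern is as a triangular weight: the coefficient of p(x|kM) in e(i|M) is max(0, 1 - |x - ik|/k). That reading is an assumption of this sketch rather than a formula stated in the text, but it reproduces the Erlang-4 coefficients 0.25, 0.5, 0.75, 1 exactly when k = 4, gives each p(x) coefficients that sum to one, and gives a mean of M. The function name, the truncation limit i_max, and the sample values of M and k are hypothetical.

```python
import math


def erlang_state_probs(m, k, i_max=60):
    """State probabilities e(0..i_max | m) for k >= 1, integral or not.

    Assumed weight on p(x | k*m) in e(i): max(0, 1 - |x - i*k| / k).
    """
    x_max = int(math.ceil((i_max + 1) * k))
    p = [math.exp(-k * m)]
    for x in range(1, x_max + 1):
        p.append(p[-1] * k * m / x)
    probs = []
    for i in range(i_max + 1):
        probs.append(sum(max(0.0, 1.0 - abs(x - i * k) / k) * px
                         for x, px in enumerate(p)))
    return probs


def variance_to_mean(probs):
    """Variance-to-mean ratio of a discrete distribution on 0, 1, 2, ..."""
    mean = sum(i * q for i, q in enumerate(probs))
    var = sum(i * i * q for i, q in enumerate(probs)) - mean ** 2
    return var / mean


e_nonint = erlang_state_probs(3.0, 2.5)  # nonintegral k
e_poisson = erlang_state_probs(3.0, 1.0)  # k = 1 reduces to the Poisson itself
print(variance_to_mean(e_nonint), variance_to_mean(e_poisson))
```

Varying k between 1 and larger values moves the variance-to-mean ratio from one downward, which is the property needed to match a specified ratio below one.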